Dataset statistics
| Number of variables | 11 |
|---|---|
| Number of observations | 856 |
| Missing cells | 1288 |
| Missing cells (%) | 13.7% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 73.7 KiB |
| Average record size in memory | 88.1 B |
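The missing-cell figures can be cross-checked against the rest of the report. The sketch below assumes that a "cell" is one observation of one variable and takes the per-column missing counts from the report's warnings; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the report; the names below are illustrative only.
n_variables = 11
n_observations = 856
missing_per_column = {
    "No. of cases_min": 312,
    "No. of cases_max": 312,
    "No. of deaths_min": 332,
    "No. of deaths_max": 332,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 1288
print(f"{missing_pct:.1f}%")  # 13.7%
```

Under that assumption, the four interval-bound columns account for every missing cell in the dataset.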
Variable types
| Categorical | 4 |
|---|---|
| Numeric | 7 |
Warnings
| Warning | Type |
|---|---|
| Country has a high cardinality: 107 distinct values | High cardinality |
| No. of cases has a high cardinality: 695 distinct values | High cardinality |
| No. of deaths has a high cardinality: 519 distinct values | High cardinality |
| No. of cases_median is highly correlated with No. of cases_min and 4 other fields | High correlation |
| No. of cases_min is highly correlated with No. of cases_median and 4 other fields | High correlation |
| No. of cases_max is highly correlated with No. of cases_median and 4 other fields | High correlation |
| No. of deaths_median is highly correlated with No. of cases_median and 4 other fields | High correlation |
| No. of deaths_min is highly correlated with No. of cases_median and 4 other fields | High correlation |
| No. of deaths_max is highly correlated with No. of cases_median and 4 other fields | High correlation |
| No. of cases_min has 312 (36.4%) missing values | Missing |
| No. of cases_max has 312 (36.4%) missing values | Missing |
| No. of deaths_min has 332 (38.8%) missing values | Missing |
| No. of deaths_max has 332 (38.8%) missing values | Missing |
| Country is uniformly distributed | Uniform |
| No. of cases_median has 136 (15.9%) zeros | Zeros |
| No. of deaths_median has 265 (31.0%) zeros | Zeros |
| No. of deaths_min has 58 (6.8%) zeros | Zeros |
Reproduction
| Analysis started | 2021-03-08 13:26:18.473654 |
|---|---|
| Analysis finished | 2021-03-08 13:26:30.141782 |
| Duration | 11.67 seconds |
| Software version | pandas-profiling v2.11.0 |
Country
Categorical
| Distinct | 107 |
|---|---|
| Distinct (%) | 12.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.8 KiB |
| Cabo Verde | 8 |
|---|---|
| Mauritania | 8 |
| Belize | 8 |
| Armenia | 8 |
| Timor-Leste | 8 |
| Other values (102) |
Length
| Max length | 37 |
|---|---|
| Median length | 8 |
| Mean length | 10.11214953 |
| Min length | 4 |
Characters and Unicode
| Total characters | 8656 |
|---|---|
| Distinct characters | 54 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Afghanistan |
|---|---|
| 2nd row | Algeria |
| 3rd row | Angola |
| 4th row | Argentina |
| 5th row | Armenia |
| Value | Count | Frequency (%) |
| Cabo Verde | 8 | 0.9% |
| Mauritania | 8 | 0.9% |
| Belize | 8 | 0.9% |
| Armenia | 8 | 0.9% |
| Timor-Leste | 8 | 0.9% |
| Thailand | 8 | 0.9% |
| Cameroon | 8 | 0.9% |
| Haiti | 8 | 0.9% |
| Solomon Islands | 8 | 0.9% |
| Papua New Guinea | 8 | 0.9% |
| Other values (97) | 776 |
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| republic | 80 | 6.4% |
| of | 56 | 4.5% |
| guinea | 24 | 1.9% |
| democratic | 24 | 1.9% |
| united | 16 | 1.3% |
| congo | 16 | 1.3% |
| arab | 16 | 1.3% |
| people's | 16 | 1.3% |
| south | 16 | 1.3% |
| sudan | 16 | 1.3% |
| Other values (121) | 976 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 1208 | 14.0% |
| i | 792 | 9.1% |
| e | 608 | 7.0% |
| n | 576 | 6.7% |
| o | 504 | 5.8% |
| r | 432 | 5.0% |
| (space) | 400 | 4.6% |
| u | 344 | 4.0% |
| l | 312 | 3.6% |
| t | 296 | 3.4% |
| Other values (44) | 3184 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6968 | |
| Uppercase Letter | 1200 | 13.9% |
| Space Separator | 400 | 4.6% |
| Open Punctuation | 24 | 0.3% |
| Close Punctuation | 24 | 0.3% |
| Other Punctuation | 24 | 0.3% |
| Dash Punctuation | 16 | 0.2% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 1208 | |
| i | 792 | |
| e | 608 | 8.7% |
| n | 576 | 8.3% |
| o | 504 | 7.2% |
| r | 432 | 6.2% |
| u | 344 | 4.9% |
| l | 312 | 4.5% |
| t | 296 | 4.2% |
| m | 224 | 3.2% |
| Other values (17) | 1672 |
| Value | Count | Frequency (%) |
| S | 120 | 10.0% |
| C | 96 | 8.0% |
| R | 96 | 8.0% |
| A | 88 | 7.3% |
| B | 88 | 7.3% |
| P | 80 | 6.7% |
| G | 80 | 6.7% |
| M | 72 | 6.0% |
| E | 64 | 5.3% |
| T | 64 | 5.3% |
| Other values (12) | 352 |
| Value | Count | Frequency (%) |
| (space) | 400 |
| Value | Count | Frequency (%) |
| ( | 24 |
| Value | Count | Frequency (%) |
| ) | 24 |
| Value | Count | Frequency (%) |
| ' | 24 |
| Value | Count | Frequency (%) |
| - | 16 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 8168 | |
| Common | 488 | 5.6% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 1208 | |
| i | 792 | 9.7% |
| e | 608 | 7.4% |
| n | 576 | 7.1% |
| o | 504 | 6.2% |
| r | 432 | 5.3% |
| u | 344 | 4.2% |
| l | 312 | 3.8% |
| t | 296 | 3.6% |
| m | 224 | 2.7% |
| Other values (39) | 2872 |
| Value | Count | Frequency (%) |
| (space) | 400 | |
| ( | 24 | 4.9% |
| ) | 24 | 4.9% |
| ' | 24 | 4.9% |
| - | 16 | 3.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8648 | |
| None | 8 | 0.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 1208 | 14.0% |
| i | 792 | 9.2% |
| e | 608 | 7.0% |
| n | 576 | 6.7% |
| o | 504 | 5.8% |
| r | 432 | 5.0% |
| (space) | 400 | 4.6% |
| u | 344 | 4.0% |
| l | 312 | 3.6% |
| t | 296 | 3.4% |
| Other values (43) | 3176 |
| Value | Count | Frequency (%) |
| ô | 8 |
Year
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2013.5 |
|---|---|
| Minimum | 2010 |
| Maximum | 2017 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.8 KiB |
Quantile statistics
| Minimum | 2010 |
|---|---|
| 5-th percentile | 2010 |
| Q1 | 2011.75 |
| median | 2013.5 |
| Q3 | 2015.25 |
| 95-th percentile | 2017 |
| Maximum | 2017 |
| Range | 7 |
| Interquartile range (IQR) | 3.5 |
Descriptive statistics
| Standard deviation | 2.29262739 |
|---|---|
| Coefficient of variation (CV) | 0.001138627956 |
| Kurtosis | -1.238315402 |
| Mean | 2013.5 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0 |
| Sum | 1723556 |
| Variance | 5.256140351 |
| Monotonicity | Decreasing |
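The descriptive statistics for Year can be reproduced from the frequency counts alone, since each of the 8 years appears 107 times. The sketch below rebuilds the column and recomputes the figures, assuming the report uses the sample (n - 1) variance and defines the coefficient of variation as standard deviation divided by mean; both assumptions are consistent with the reported values.

```python
import statistics

# Rebuild the Year column from the report: 8 distinct years (2010-2017),
# each appearing 107 times, for 856 observations in total.
years = [year for year in range(2010, 2018) for _ in range(107)]

mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(years)      # square root of the sample variance
cv = std_dev / mean                    # coefficient of variation

print(len(years), sum(years))  # 856 1723556
print(mean)                    # 2013.5
print(round(variance, 6))      # 5.25614
print(cv)
```

The coefficient of variation is tiny because the spread of a few years is negligible next to a mean of about 2013.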
Histogram with fixed size bins (bins=8)
| Value | Count | Frequency (%) |
| 2010 | 107 | |
| 2011 | 107 | |
| 2012 | 107 | |
| 2013 | 107 | |
| 2014 | 107 | |
| 2015 | 107 | |
| 2016 | 107 | |
| 2017 | 107 |
| Value | Count | Frequency (%) |
| 2010 | 107 | |
| 2011 | 107 | |
| 2012 | 107 | |
| 2013 | 107 | |
| 2014 | 107 |
| Value | Count | Frequency (%) |
| 2017 | 107 | |
| 2016 | 107 | |
| 2015 | 107 | |
| 2014 | 107 | |
| 2013 | 107 |
No. of cases
Categorical
| Distinct | 695 |
|---|---|
| Distinct (%) | 81.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.8 KiB |
| 0 | |
|---|---|
| 3 | 6 |
| 7 | 4 |
| 1 | 4 |
| 19 | 3 |
| Other values (690) |
Length
| Max length | 27 |
|---|---|
| Median length | 18 |
| Mean length | 14.21962617 |
| Min length | 1 |
Characters and Unicode
| Total characters | 12172 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 679 |
|---|---|
| Unique (%) | 79.3% |
Sample
| 1st row | 630308[495000-801000] |
|---|---|
| 2nd row | 0 |
| 3rd row | 4615605[3106000-6661000] |
| 4th row | 0 |
| 5th row | 0 |
| Value | Count | Frequency (%) |
| 0 | 136 | 15.9% |
| 3 | 6 | 0.7% |
| 7 | 4 | 0.5% |
| 1 | 4 | 0.5% |
| 19 | 3 | 0.4% |
| 4 | 3 | 0.4% |
| 6 | 3 | 0.4% |
| 2 | 2 | 0.2% |
| 12 | 2 | 0.2% |
| 22 | 2 | 0.2% |
| Other values (685) | 691 |
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 0 | 136 | 15.9% |
| 3 | 6 | 0.7% |
| 7 | 4 | 0.5% |
| 1 | 4 | 0.5% |
| 6 | 3 | 0.4% |
| 19 | 3 | 0.4% |
| 4 | 3 | 0.4% |
| 12 | 2 | 0.2% |
| 15 | 2 | 0.2% |
| 81 | 2 | 0.2% |
| Other values (685) | 691 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 3950 | |
| 1 | 1063 | 8.7% |
| 2 | 829 | 6.8% |
| 3 | 794 | 6.5% |
| 4 | 746 | 6.1% |
| 5 | 687 | 5.6% |
| 6 | 655 | 5.4% |
| 7 | 624 | 5.1% |
| 8 | 615 | 5.1% |
| 9 | 577 | 4.7% |
| Other values (3) | 1632 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 10540 | |
| Open Punctuation | 544 | 4.5% |
| Dash Punctuation | 544 | 4.5% |
| Close Punctuation | 544 | 4.5% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 3950 | |
| 1 | 1063 | 10.1% |
| 2 | 829 | 7.9% |
| 3 | 794 | 7.5% |
| 4 | 746 | 7.1% |
| 5 | 687 | 6.5% |
| 6 | 655 | 6.2% |
| 7 | 624 | 5.9% |
| 8 | 615 | 5.8% |
| 9 | 577 | 5.5% |
| Value | Count | Frequency (%) |
| [ | 544 |
| Value | Count | Frequency (%) |
| - | 544 |
| Value | Count | Frequency (%) |
| ] | 544 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 12172 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 3950 | |
| 1 | 1063 | 8.7% |
| 2 | 829 | 6.8% |
| 3 | 794 | 6.5% |
| 4 | 746 | 6.1% |
| 5 | 687 | 5.6% |
| 6 | 655 | 5.4% |
| 7 | 624 | 5.1% |
| 8 | 615 | 5.1% |
| 9 | 577 | 4.7% |
| Other values (3) | 1632 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 12172 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 3950 | |
| 1 | 1063 | 8.7% |
| 2 | 829 | 6.8% |
| 3 | 794 | 6.5% |
| 4 | 746 | 6.1% |
| 5 | 687 | 5.6% |
| 6 | 655 | 5.4% |
| 7 | 624 | 5.1% |
| 8 | 615 | 5.1% |
| 9 | 577 | 4.7% |
| Other values (3) | 1632 |
No. of deaths
Categorical
| Distinct | 519 |
|---|---|
| Distinct (%) | 60.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.8 KiB |
| 0 | |
|---|---|
| 1 | 15 |
| 0[0-1] | 8 |
| 1[0-2] | 7 |
| 2[0-4] | 7 |
| Other values (514) |
Length
| Max length | 21 |
|---|---|
| Median length | 8 |
| Mean length | 8.373831776 |
| Min length | 1 |
Characters and Unicode
| Total characters | 7168 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 486 |
|---|---|
| Unique (%) | 56.8% |
Sample
| 1st row | 298[110-510] |
|---|---|
| 2nd row | 0 |
| 3rd row | 13316[9970-16600] |
| 4th row | 0 |
| 5th row | 0 |
| Value | Count | Frequency (%) |
| 0 | 257 | |
| 1 | 15 | 1.8% |
| 0[0-1] | 8 | 0.9% |
| 1[0-2] | 7 | 0.8% |
| 2[0-4] | 7 | 0.8% |
| 2 | 6 | 0.7% |
| 1[0-3] | 6 | 0.7% |
| 4 | 6 | 0.7% |
| 5[1-9] | 4 | 0.5% |
| 10 | 4 | 0.5% |
| Other values (509) | 536 |
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 0 | 257 | |
| 1 | 15 | 1.8% |
| 0[0-1 | 8 | 0.9% |
| 2[0-4 | 7 | 0.8% |
| 1[0-2 | 7 | 0.8% |
| 1[0-3 | 6 | 0.7% |
| 4 | 6 | 0.7% |
| 2 | 6 | 0.7% |
| 5[1-9 | 4 | 0.5% |
| 10 | 4 | 0.5% |
| Other values (509) | 536 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1710 | |
| 1 | 784 | |
| 2 | 555 | 7.7% |
| [ | 524 | 7.3% |
| - | 524 | 7.3% |
| ] | 524 | 7.3% |
| 4 | 417 | 5.8% |
| 3 | 412 | 5.7% |
| 6 | 386 | 5.4% |
| 5 | 361 | 5.0% |
| Other values (3) | 971 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5596 | |
| Open Punctuation | 524 | 7.3% |
| Dash Punctuation | 524 | 7.3% |
| Close Punctuation | 524 | 7.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 1710 | |
| 1 | 784 | |
| 2 | 555 | 9.9% |
| 4 | 417 | 7.5% |
| 3 | 412 | 7.4% |
| 6 | 386 | 6.9% |
| 5 | 361 | 6.5% |
| 7 | 345 | 6.2% |
| 8 | 320 | 5.7% |
| 9 | 306 | 5.5% |
| Value | Count | Frequency (%) |
| [ | 524 |
| Value | Count | Frequency (%) |
| - | 524 |
| Value | Count | Frequency (%) |
| ] | 524 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 7168 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 1710 | |
| 1 | 784 | |
| 2 | 555 | 7.7% |
| [ | 524 | 7.3% |
| - | 524 | 7.3% |
| ] | 524 | 7.3% |
| 4 | 417 | 5.8% |
| 3 | 412 | 5.7% |
| 6 | 386 | 5.4% |
| 5 | 361 | 5.0% |
| Other values (3) | 971 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7168 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 1710 | |
| 1 | 784 | |
| 2 | 555 | 7.7% |
| [ | 524 | 7.3% |
| - | 524 | 7.3% |
| ] | 524 | 7.3% |
| 4 | 417 | 5.8% |
| 3 | 412 | 5.7% |
| 6 | 386 | 5.4% |
| 5 | 361 | 5.0% |
| Other values (3) | 971 |
No. of cases_median
Real number (ℝ≥0)
| Distinct | 694 |
|---|---|
| Distinct (%) | 81.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2081990.36 |
|---|---|
| Minimum | 0 |
| Maximum | 62020888 |
| Zeros | 136 |
| Zeros (%) | 15.9% |
| Memory size | 6.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 238.5 |
| median | 37521 |
| Q3 | 1656628.25 |
| 95-th percentile | 8479671.5 |
| Maximum | 62020888 |
| Range | 62020888 |
| Interquartile range (IQR) | 1656389.75 |
Descriptive statistics
| Standard deviation | 6381892.5 |
|---|---|
| Coefficient of variation (CV) | 3.065284366 |
| Kurtosis | 55.06562959 |
| Mean | 2081990.36 |
| Median Absolute Deviation (MAD) | 37521 |
| Skewness | 6.830041648 |
| Sum | 1782183748 |
| Variance | 4.072855188 × 10^13 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 136 | 15.9% |
| 3 | 6 | 0.7% |
| 1 | 4 | 0.5% |
| 7 | 4 | 0.5% |
| 4 | 3 | 0.4% |
| 6 | 3 | 0.4% |
| 19 | 3 | 0.4% |
| 34 | 2 | 0.2% |
| 242 | 2 | 0.2% |
| 81 | 2 | 0.2% |
| Other values (684) | 691 |
| Value | Count | Frequency (%) |
| 0 | 136 | |
| 1 | 4 | 0.5% |
| 2 | 2 | 0.2% |
| 3 | 6 | 0.7% |
| 4 | 3 | 0.4% |
| Value | Count | Frequency (%) |
| 62020888 | 1 | |
| 61587135 | 1 | |
| 60749349 | 1 | |
| 60529456 | 1 | |
| 59365039 | 1 |
No. of cases_min
Real number (ℝ≥0)
| Distinct | 448 |
|---|---|
| Distinct (%) | 82.4% |
| Missing | 312 |
| Missing (%) | 36.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2157556.342 |
|---|---|
| Minimum | 30 |
| Maximum | 43880000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.8 KiB |
Quantile statistics
| Minimum | 30 |
|---|---|
| 5-th percentile | 1200 |
| Q1 | 39000 |
| median | 498000 |
| Q3 | 2084500 |
| 95-th percentile | 7118900 |
| Maximum | 43880000 |
| Range | 43879970 |
| Interquartile range (IQR) | 2045500 |
Descriptive statistics
| Standard deviation | 5384821.887 |
|---|---|
| Coefficient of variation (CV) | 2.495796649 |
| Kurtosis | 37.59826868 |
| Mean | 2157556.342 |
| Median Absolute Deviation (MAD) | 490800 |
| Skewness | 5.713503985 |
| Sum | 1173710650 |
| Variance | 2.899630676 × 10^13 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 33000 | 6 | 0.7% |
| 21000 | 5 | 0.6% |
| 22000 | 4 | 0.5% |
| 60000 | 4 | 0.5% |
| 11000 | 3 | 0.4% |
| 19000 | 3 | 0.4% |
| 606000 | 3 | 0.4% |
| 15000 | 3 | 0.4% |
| 49000 | 3 | 0.4% |
| 145000 | 3 | 0.4% |
| Other values (438) | 507 | |
| (Missing) | 312 |
| Value | Count | Frequency (%) |
| 30 | 1 | |
| 120 | 2 | |
| 230 | 1 | |
| 360 | 2 | |
| 400 | 1 |
| Value | Count | Frequency (%) |
| 43880000 | 1 | |
| 43800000 | 1 | |
| 43510000 | 1 | |
| 43310000 | 1 | |
| 41180000 | 1 |
No. of cases_max
Real number (ℝ≥0)
| Distinct | 481 |
|---|---|
| Distinct (%) | 88.4% |
| Missing | 312 |
| Missing (%) | 36.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4913740.919 |
|---|---|
| Minimum | 40 |
| Maximum | 84840000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.8 KiB |
Quantile statistics
| Minimum | 40 |
|---|---|
| 5-th percentile | 1715 |
| Q1 | 75000 |
| median | 1389000 |
| Q3 | 5277750 |
| 95-th percentile | 15450000 |
| Maximum | 84840000 |
| Range | 84839960 |
| Interquartile range (IQR) | 5202750 |
Descriptive statistics
| Standard deviation | 11027731.66 |
|---|---|
| Coefficient of variation (CV) | 2.244263962 |
| Kurtosis | 30.85611056 |
| Mean | 4913740.919 |
| Median Absolute Deviation (MAD) | 1379200 |
| Skewness | 5.127501966 |
| Sum | 2673075060 |
| Variance | 1.216108656 × 10^14 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 12000 | 5 | 0.6% |
| 12230000 | 4 | 0.5% |
| 1000 | 3 | 0.4% |
| 15000 | 3 | 0.4% |
| 16000 | 3 | 0.4% |
| 34000 | 3 | 0.4% |
| 10000 | 3 | 0.4% |
| 75000 | 3 | 0.4% |
| 144000 | 3 | 0.4% |
| 27000 | 3 | 0.4% |
| Other values (471) | 511 | |
| (Missing) | 312 |
| Value | Count | Frequency (%) |
| 40 | 1 | |
| 160 | 1 | |
| 170 | 1 | |
| 400 | 1 | |
| 430 | 1 |
| Value | Count | Frequency (%) |
| 84840000 | 1 | |
| 83800000 | 1 | |
| 83240000 | 1 | |
| 82700000 | 1 | |
| 81580000 | 1 |
No. of deaths_median
Real number (ℝ≥0)
| Distinct | 447 |
|---|---|
| Distinct (%) | 52.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4713.880841 |
|---|---|
| Minimum | 0 |
| Maximum | 146734 |
| Zeros | 265 |
| Zeros (%) | 31.0% |
| Memory size | 6.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 55.5 |
| Q3 | 4096 |
| 95-th percentile | 20372.5 |
| Maximum | 146734 |
| Range | 146734 |
| Interquartile range (IQR) | 4096 |
Descriptive statistics
| Standard deviation | 13183.31289 |
|---|---|
| Coefficient of variation (CV) | 2.796700497 |
| Kurtosis | 51.6589746 |
| Mean | 4713.880841 |
| Median Absolute Deviation (MAD) | 55.5 |
| Skewness | 6.369920518 |
| Sum | 4035082 |
| Variance | 173799738.7 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 265 | |
| 1 | 32 | 3.7% |
| 2 | 21 | 2.5% |
| 3 | 9 | 1.1% |
| 5 | 9 | 1.1% |
| 7 | 9 | 1.1% |
| 10 | 8 | 0.9% |
| 4 | 8 | 0.9% |
| 6 | 5 | 0.6% |
| 34 | 4 | 0.5% |
| Other values (437) | 486 |
| Value | Count | Frequency (%) |
| 0 | 265 | |
| 1 | 32 | 3.7% |
| 2 | 21 | 2.5% |
| 3 | 9 | 1.1% |
| 4 | 8 | 0.9% |
| Value | Count | Frequency (%) |
| 146734 | 1 | |
| 136533 | 1 | |
| 125290 | 1 | |
| 116472 | 1 | |
| 107843 | 1 |
No. of deaths_min
Real number (ℝ≥0)
| Distinct | 255 |
|---|---|
| Distinct (%) | 48.7% |
| Missing | 332 |
| Missing (%) | 38.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5619.108779 |
|---|---|
| Minimum | 0 |
| Maximum | 115000 |
| Zeros | 58 |
| Zeros (%) | 6.8% |
| Memory size | 6.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 5 |
| median | 390 |
| Q3 | 6592.5 |
| 95-th percentile | 19500 |
| Maximum | 115000 |
| Range | 115000 |
| Interquartile range (IQR) | 6587.5 |
Descriptive statistics
| Standard deviation | 12823.71424 |
|---|---|
| Coefficient of variation (CV) | 2.282161593 |
| Kurtosis | 31.70327045 |
| Mean | 5619.108779 |
| Median Absolute Deviation (MAD) | 390 |
| Skewness | 5.036093056 |
| Sum | 2944413 |
| Variance | 164447646.9 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 58 | 6.8% |
| 1 | 25 | 2.9% |
| 2 | 20 | 2.3% |
| 3 | 15 | 1.8% |
| 10 | 13 | 1.5% |
| 60 | 10 | 1.2% |
| 80 | 9 | 1.1% |
| 70 | 9 | 1.1% |
| 20 | 9 | 1.1% |
| 5 | 9 | 1.1% |
| Other values (245) | 347 | |
| (Missing) | 332 |
| Value | Count | Frequency (%) |
| 0 | 58 | |
| 1 | 25 | |
| 2 | 20 | 2.3% |
| 3 | 15 | 1.8% |
| 4 | 7 | 0.8% |
| Value | Count | Frequency (%) |
| 115000 | 1 | |
| 107000 | 1 | |
| 98100 | 1 | |
| 91200 | 1 | |
| 84600 | 1 |
No. of deaths_max
Real number (ℝ≥0)
| Distinct | 336 |
|---|---|
| Distinct (%) | 64.1% |
| Missing | 332 |
| Missing (%) | 38.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10149.42939 |
|---|---|
| Minimum | 1 |
| Maximum | 179000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3.15 |
| Q1 | 180 |
| median | 3565 |
| Q3 | 12400 |
| 95-th percentile | 41670 |
| Maximum | 179000 |
| Range | 178999 |
| Interquartile range (IQR) | 12220 |
Descriptive statistics
| Standard deviation | 20173.78393 |
|---|---|
| Coefficient of variation (CV) | 1.987676662 |
| Kurtosis | 28.83634105 |
| Mean | 10149.42939 |
| Median Absolute Deviation (MAD) | 3545 |
| Skewness | 4.759147803 |
| Sum | 5318301 |
| Variance | 406981558.2 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 20 | 12 | 1.4% |
| 1 | 11 | 1.3% |
| 4 | 9 | 1.1% |
| 3 | 9 | 1.1% |
| 10 | 9 | 1.1% |
| 5 | 7 | 0.8% |
| 180 | 7 | 0.8% |
| 2 | 7 | 0.8% |
| 60 | 7 | 0.8% |
| 760 | 6 | 0.7% |
| Other values (326) | 440 | |
| (Missing) | 332 |
| Value | Count | Frequency (%) |
| 1 | 11 | |
| 2 | 7 | |
| 3 | 9 | |
| 4 | 9 | |
| 5 | 7 |
| Value | Count | Frequency (%) |
| 179000 | 1 | |
| 166000 | 1 | |
| 152000 | 1 | |
| 142000 | 1 | |
| 131000 | 1 |
WHO Region
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.8 KiB |
| Africa | |
|---|---|
| Americas | |
| Eastern Mediterranean | |
| Western Pacific | |
| Europe |
Length
| Max length | 21 |
|---|---|
| Median length | 8 |
| Mean length | 10.03738318 |
| Min length | 6 |
Characters and Unicode
| Total characters | 8592 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Eastern Mediterranean |
|---|---|
| 2nd row | Africa |
| 3rd row | Africa |
| 4th row | Americas |
| 5th row | Europe |
| Value | Count | Frequency (%) |
| Africa | 344 | |
| Americas | 168 | |
| Eastern Mediterranean | 112 | 13.1% |
| Western Pacific | 88 | 10.3% |
| Europe | 72 | 8.4% |
| South-East Asia | 72 | 8.4% |
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| africa | 344 | |
| americas | 168 | |
| mediterranean | 112 | 9.9% |
| eastern | 112 | 9.9% |
| pacific | 88 | 7.8% |
| western | 88 | 7.8% |
| asia | 72 | 6.4% |
| europe | 72 | 6.4% |
| south-east | 72 | 6.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 1080 | |
| r | 1008 | |
| i | 872 | |
| e | 864 | |
| c | 688 | |
| A | 584 | 6.8% |
| s | 512 | 6.0% |
| t | 456 | 5.3% |
| f | 432 | 5.0% |
| n | 424 | 4.9% |
| Other values (13) | 1672 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 7048 | |
| Uppercase Letter | 1200 | 14.0% |
| Space Separator | 272 | 3.2% |
| Dash Punctuation | 72 | 0.8% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 1080 | |
| r | 1008 | |
| i | 872 | |
| e | 864 | |
| c | 688 | |
| s | 512 | |
| t | 456 | |
| f | 432 | 6.1% |
| n | 424 | 6.0% |
| m | 168 | 2.4% |
| Other values (5) | 544 |
| Value | Count | Frequency (%) |
| A | 584 | |
| E | 256 | |
| M | 112 | 9.3% |
| W | 88 | 7.3% |
| P | 88 | 7.3% |
| S | 72 | 6.0% |
| Value | Count | Frequency (%) |
| (space) | 272 |
| Value | Count | Frequency (%) |
| - | 72 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 8248 | |
| Common | 344 | 4.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 1080 | |
| r | 1008 | |
| i | 872 | |
| e | 864 | |
| c | 688 | |
| A | 584 | |
| s | 512 | 6.2% |
| t | 456 | 5.5% |
| f | 432 | 5.2% |
| n | 424 | 5.1% |
| Other values (11) | 1328 |
| Value | Count | Frequency (%) |
| (space) | 272 | |
| - | 72 | 20.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8592 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 1080 | |
| r | 1008 | |
| i | 872 | |
| e | 864 | |
| c | 688 | |
| A | 584 | 6.8% |
| s | 512 | 6.0% |
| t | 456 | 5.3% |
| f | 432 | 5.0% |
| n | 424 | 4.9% |
| Other values (13) | 1672 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
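The definition above translates directly into a few lines of Python. This is a minimal sketch using only the standard library; the function name `pearson_r` and the sample data are illustrative and are not taken from the report or from any library.

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/(n - 1) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2 * v + 10 for v in x]))  # 1.0 up to floating-point rounding
print(pearson_r(x, [-3 * v for v in x]))      # -1.0 up to floating-point rounding
```

The two calls illustrate the invariance property: changing the slope or offset of a linear relationship leaves the magnitude of r unchanged, and only the sign of the slope matters.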
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
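As a sketch of that recipe, the code below ranks both variables (tied values share the mean of their positions, which is one common convention and an assumption here) and then applies the Pearson formula to the ranks. All function names are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance of x and y over the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]    # monotonic but not linear
print(spearman_rho(x, y))  # 1.0 up to floating-point rounding
print(pearson_r(x, y))     # strictly below 1
```

The cubic example shows the difference the text describes: the relationship is perfectly monotonic, so ρ reaches 1 while r does not.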
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
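The pair-counting definition can be sketched as follows. This implements the plain form described above (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations frequently use a tie-adjusted variant instead, so results may differ on data with ties. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 means a tie, which counts as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # 2 concordant, 1 discordant out of 3 pairs
```

This quadratic-time loop is fine for illustration but would be slow on large columns.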
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
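Nullity correlation, as described above, can be illustrated by correlating presence indicators (1 if a value is present, 0 if it is missing) for two columns. The sketch below is an assumption about how such a measure can be computed, not a description of the report's implementation. The toy values mirror the first five rows of the dataset, with `None` standing in for NaN.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no variation.
        return None
    return cov / math.sqrt(var_a * var_b)

# Toy data: the interval bounds are missing in exactly the same rows.
cases_min = [495000.0, None, 3106000.0, None, None]
cases_max = [801000.0, None, 6661000.0, None, None]
year = [2017, 2017, 2017, 2017, 2017]

print(nullity_correlation(cases_min, cases_max))  # 1.0 up to floating-point rounding
print(nullity_correlation(cases_min, year))       # None: Year is never missing
```

A value near 1 means two columns tend to be missing together, which is what the matching missing counts for the `_min` and `_max` columns suggest.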
First rows
| | Country | Year | No. of cases | No. of deaths | No. of cases_median | No. of cases_min | No. of cases_max | No. of deaths_median | No. of deaths_min | No. of deaths_max | WHO Region |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 2017 | 630308[495000-801000] | 298[110-510] | 630308 | 495000.0 | 801000.0 | 298 | 110.0 | 510.0 | Eastern Mediterranean |
| 1 | Algeria | 2017 | 0 | 0 | 0 | NaN | NaN | 0 | NaN | NaN | Africa |
| 2 | Angola | 2017 | 4615605[3106000-6661000] | 13316[9970-16600] | 4615605 | 3106000.0 | 6661000.0 | 13316 | 9970.0 | 16600.0 | Africa |
| 3 | Argentina | 2017 | 0 | 0 | 0 | NaN | NaN | 0 | NaN | NaN | Americas |
| 4 | Armenia | 2017 | 0 | 0 | 0 | NaN | NaN | 0 | NaN | NaN | Europe |
| 5 | Azerbaijan | 2017 | 0 | 0 | 0 | NaN | NaN | 0 | NaN | NaN | Europe |
| 6 | Bangladesh | 2017 | 32924[30000-36000] | 76[3-130] | 32924 | 30000.0 | 36000.0 | 76 | 3.0 | 130.0 | South-East Asia |
| 7 | Belize | 2017 | 7 | 0 | 7 | NaN | NaN | 0 | NaN | NaN | Americas |
| 8 | Benin | 2017 | 4111699[2774000-6552000] | 7328[5740-8920] | 4111699 | 2774000.0 | 6552000.0 | 7328 | 5740.0 | 8920.0 | Africa |
| 9 | Bhutan | 2017 | 11 | 0 | 11 | NaN | NaN | 0 | NaN | NaN | South-East Asia |
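The rows above suggest that the `_median`, `_min` and `_max` columns were derived from the bracketed strings in `No. of cases` and `No. of deaths`. The report does not show how this was done; the sketch below is one possible way to do it, and the function name `parse_estimate` and the regular expression are hypothetical.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a point estimate optionally followed by "[low-high]".
ESTIMATE_RE = re.compile(r"^(\d+)(?:\[(\d+)-(\d+)\])?$")

def parse_estimate(text: str) -> Tuple[int, Optional[float], Optional[float]]:
    """Split a string such as '630308[495000-801000]' into (median, min, max).

    Values without a bracketed range, such as '0', yield None for both bounds
    (the report shows these as NaN).
    """
    match = ESTIMATE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised estimate: {text!r}")
    median, low, high = match.groups()
    if low is None:
        return int(median), None, None
    return int(median), float(low), float(high)

print(parse_estimate("630308[495000-801000]"))  # (630308, 495000.0, 801000.0)
print(parse_estimate("0"))                      # (0, None, None)
```

Rows without a bracketed range are exactly the rows where the bound columns are missing, which is consistent with the missing-value counts reported for the `_min` and `_max` columns.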
Last rows
| | Country | Year | No. of cases | No. of deaths | No. of cases_median | No. of cases_min | No. of cases_max | No. of deaths_median | No. of deaths_min | No. of deaths_max | WHO Region |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 846 | Uganda | 2010 | 11503116[7618000-17700000] | 21558[17200-26000] | 11503116 | 7618000.0 | 17700000.0 | 21558 | 17200.0 | 26000.0 | Africa |
| 847 | United Arab Emirates | 2010 | 0 | 0 | 0 | NaN | NaN | 0 | NaN | NaN | Eastern Mediterranean |
| 848 | United Republic of Tanzania | 2010 | 6545932[3955000-9995000] | 20281[17600-23000] | 6545932 | 3955000.0 | 9995000.0 | 20281 | 17600.0 | 23000.0 | Africa |
| 849 | Uzbekistan | 2010 | 3 | 0 | 3 | NaN | NaN | 0 | NaN | NaN | Europe |
| 850 | Vanuatu | 2010 | 15695[12000-20000] | 20[2-40] | 15695 | 12000.0 | 20000.0 | 20 | 2.0 | 40.0 | Western Pacific |
| 851 | Venezuela (Bolivarian Republic of) | 2010 | 57257[47000-74000] | 52[9-90] | 57257 | 47000.0 | 74000.0 | 52 | 9.0 | 90.0 | Americas |
| 852 | Viet Nam | 2010 | 23062[21000-26000] | 45[2-80] | 23062 | 21000.0 | 26000.0 | 45 | 2.0 | 80.0 | Western Pacific |
| 853 | Yemen | 2010 | 1134927[611000-2686000] | 2874[90-8490] | 1134927 | 611000.0 | 2686000.0 | 2874 | 90.0 | 8490.0 | Eastern Mediterranean |
| 854 | Zambia | 2010 | 2169307[1449000-3095000] | 6544[5580-7510] | 2169307 | 1449000.0 | 3095000.0 | 6544 | 5580.0 | 7510.0 | Africa |
| 855 | Zimbabwe | 2010 | 1095083[606000-1717000] | 2803[80-6190] | 1095083 | 606000.0 | 1717000.0 | 2803 | 80.0 | 6190.0 | Africa |